qiime feature-table summarize \
--i-table dada2/table_feat_sample_freq_filtered_yoga_dada2.qza \
--m-sample-metadata-file data/sample-metadata.tsv \
--o-visualization dada2/table_feat_sample_freq_filtered_yoga_dada2.qzv
qiime tools view dada2/table_feat_sample_freq_filtered_yoga_dada2.qzv
You can run “qiime feature-table filter-samples --help” and
“qiime feature-table filter-features --help” to learn more about the filtering parameters.
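As an illustration of the two filtering commands named above, the following is a minimal sketch, not a recommended setting. The file paths and both threshold values are hypothetical placeholders chosen for this example. The “--p-min-frequency” and “--p-min-samples” options are parameters of “filter-samples” and “filter-features”, respectively. The guard simply skips the commands when QIIME2 is not on the PATH.

```shell
#!/usr/bin/env bash
# Illustrative values only: these paths and thresholds are assumptions,
# not taken from the worked example in the text.
IN_TABLE="dada2/table.qza"
SAMPLE_FILTERED="dada2/table_min_freq.qza"
FEATURE_FILTERED="dada2/table_min_freq_min_samples.qza"
MIN_FREQUENCY=1000   # minimum total feature count a sample must have
MIN_SAMPLES=2        # minimum number of samples a feature must appear in

if command -v qiime >/dev/null 2>&1; then
  # Drop samples whose total frequency is below MIN_FREQUENCY.
  qiime feature-table filter-samples \
    --i-table "$IN_TABLE" \
    --p-min-frequency "$MIN_FREQUENCY" \
    --o-filtered-table "$SAMPLE_FILTERED"

  # Drop features observed in fewer than MIN_SAMPLES samples.
  qiime feature-table filter-features \
    --i-table "$SAMPLE_FILTERED" \
    --p-min-samples "$MIN_SAMPLES" \
    --o-filtered-table "$FEATURE_FILTERED"
else
  echo "qiime not found on PATH; activate the QIIME2 environment first." >&2
fi
```

Suitable thresholds depend on the sequencing depth of the study, so it is worth inspecting the feature table summary before choosing them.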
The preprocessing steps followed above produce the feature table and the representative
features (OTUs/ASVs) on which the downstream analyses rely (taxonomic classification,
phylogenetic relationships, alpha and beta diversity analysis, and differential abundance).
Several questions usually come up at this point: Which approach is better, clustering or
denoising? Which clustering method is best (de novo, closed-reference, or open-reference
clustering)? And which denoising method is better, DADA2 or Deblur? The answer many
experts give is to try all of them and adopt the one that works for your data. A number of
articles discuss the pros and cons of each of these methods.
7.3.5 Taxonomic Assignment with QIIME2
As discussed above, one of the main goals of metagenomic studies is to identify the
microbial organisms present in a sample using either alignment-based classifiers or machine
learning classifiers. QIIME2 provides alignment-based, machine learning, and hybrid
classifier methods in the “q2-feature-classifier” plugin. The alignment-based classifier
methods are “classify-consensus-blast” and “classify-consensus-vsearch”. The machine
learning classifier method is “classify-sklearn”, which applies a pre-fitted sklearn-based
taxonomy classifier (any of the classifiers available in the scikit-learn Python package).
You can download a shared pre-fitted classifier; however, it is safer to train your own and
use it for taxonomy assignment. Some pre-fitted naïve Bayes classifiers and weighted
taxonomic classifiers are available on the QIIME2 data resources web page at
“https://docs.qiime2.org/2022.2/data-resources/” or the corresponding page of any newer
release. To train your own taxonomy classifier, you can use either of the two training
methods provided by “q2-feature-classifier”: “fit-classifier-naive-bayes” to train a naïve
Bayes classifier, or “fit-classifier-sklearn” to train any arbitrary scikit-learn classifier.
An alpha-version hybrid classifier (VSEARCH + sklearn classifier) is provided by the
“classify-hybrid-vsearch-sklearn” method.
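The train-then-classify workflow described above can be sketched as follows. This is a minimal example under stated assumptions: all file names are hypothetical placeholders. The reference reads and reference taxonomy are assumed to be already imported as QIIME2 artifacts, and the representative sequences are assumed to come from the earlier denoising or clustering step. The guard skips the commands when QIIME2 is not on the PATH.

```shell
#!/usr/bin/env bash
# All artifact names below are hypothetical placeholders for this sketch.
REF_SEQS="ref-seqs.qza"        # reference sequences (FeatureData[Sequence])
REF_TAX="ref-taxonomy.qza"     # taxonomy of the reference sequences
REP_SEQS="dada2/rep-seqs.qza"  # representative sequences to classify
CLASSIFIER="classifier.qza"    # trained classifier written by step 1
TAXONOMY="taxonomy.qza"        # per-feature taxonomy written by step 2

if command -v qiime >/dev/null 2>&1; then
  # Step 1: train a naive Bayes classifier on the reference data.
  qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads "$REF_SEQS" \
    --i-reference-taxonomy "$REF_TAX" \
    --o-classifier "$CLASSIFIER"

  # Step 2: assign taxonomy to the representative sequences.
  qiime feature-classifier classify-sklearn \
    --i-classifier "$CLASSIFIER" \
    --i-reads "$REP_SEQS" \
    --o-classification "$TAXONOMY"
else
  echo "qiime not found on PATH; activate the QIIME2 environment first." >&2
fi
```

A downloaded pre-fitted classifier can be passed to “classify-sklearn” in place of the one trained in step 1, provided it was built with a compatible scikit-learn version.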
7.3.5.1 Using Alignment-Based Classifiers
The alignment-based classifiers use each representative sequence generated by clustering
or denoising as a query to search against the reference database sequences whose taxa are
known. The taxonomy assignment is based on the consensus of the BLAST hits whose
percent identity is greater than a predetermined identity threshold. The taxonomy ranks,
from the highest to the lowest, are kingdom, phylum, class, order, family, genus, and
species. The confidence of the assignment is the fraction of top hits that match